ABSTRACT

The Mid-South Clinical Data Research Network (CDRN) 
encompasses three large health systems: (1) Vanderbilt 
Health System (VU) with electronic medical records for 
over 2 million patients, (2) the Vanderbilt Healthcare 
Affiliated Network (VHAN) which currently includes over 
40 hospitals, hundreds of ambulatory practices, and over 
3 million patients in the Mid-South, and (3) Greenway 
Medical Technologies, with access to 24 million patients 
nationally. Initial goals of the Mid-South CDRN include: 
(1) expansion of our VU data network to include the 
VHAN and Greenway systems, (2) developing data 
integration/interoperability across the three systems, 
(3) improving our current tools for extracting clinical 
data, (4) optimization of tools for collection of patient- 
reported data, and (5) expansion of clinical decision 
support. By 18 months, we anticipate our CDRN will 
robustly support projects in comparative effectiveness 
research, pragmatic clinical trials, and other key research 
areas and have the capacity to share data and health 
information technology tools nationally. 



INTRODUCTION 

The Mid-South Clinical Data Research Network 
(CDRN) will create a large research network to 
support pragmatic trials and comparative effective- 
ness research. The Mid-South CDRN will connect 
three major health system networks: (1) the 
Vanderbilt University Health System (VU), which 
currently includes electronic medical records for 
over 2 million patients, (2) a growing Vanderbilt 
Healthcare Affiliated Network (VHAN), which cur- 
rently includes over 40 hospitals and hundreds of 
ambulatory practices, and will cover over 3 million 
patients in the Mid-South region, and (3) ambula- 
tory practices served by Greenway Medical 
Technologies, covering over 24 million patients 
across the country. The Mid-South CDRN will 
leverage current infrastructure, health information 
technologies, and data standards to connect the 
three health systems (see figure 1). The Mid-South 
CDRN will have a broad reach that includes a 
diverse population of patients across a large geo- 
graphic region (see figure 2). 

OVERVIEW OF EXISTING CLINICAL SYSTEMS 

The primary objective of creating the Mid-South 
CDRN is to permit research across numerous sites 
of healthcare delivery throughout the Southeast USA. 
To accomplish this, the Mid-South CDRN will 
accommodate the diverse health information tech- 
nologies installed at participating sites, and accept 
the different data formats they produce. We will 
develop connections among the installed technolo- 
gies in the three major health systems, described 
below. 



The Vanderbilt Health System (VU) network uti- 
lizes a comprehensive electronic health record 
(EHR) system called StarPanel. 1 Developed at 
Vanderbilt, StarPanel is an integrated web-based 
user interface to a number of clinically facing tools, 
such as clinical documentation systems, communi- 
cation tools supporting secure provider-to-provider 
and patient-to-provider messaging, reminders, 
alerts, management of work queues, and notifica- 
tion of new results. StarPanel is used throughout 
Vanderbilt and has been in place for over 15 years, 
with peak usage routinely exceeding 8000 concur- 
rent sessions. Because StarPanel is the only system 
used across the VU network, all clinical data gener- 
ated during patient care are immediately available 
to any other Vanderbilt provider. Underlying 
StarPanel are a data abstraction, a data aggregation, 
and a data storage layer, collectively called 
StarChart. StarChart additionally connects and can 
integrate data from numerous disparate health 
information technologies, such as commercial EHR 
systems installed elsewhere. 2 3 StarChart accepts 
patient data in diverse forms and integrates them 
through a common header format that identifies 
the patient, the nature of the report, who generated 
the report at what time, and demographic data. 
Data in StarChart are encoded to external stan- 
dards where feasible (see table 1). 
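
The common header idea can be illustrated with a small sketch. The class and field names below are hypothetical and are not StarChart's actual schema; the point is only that documents with very different bodies can be sorted and filtered on shared header fields. The LOINC code 4548-4 (hemoglobin A1c) is used purely as sample data.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class ReportHeader:
    """Hypothetical common header; field names are illustrative only."""
    patient_id: str          # identifies the patient
    report_type: str         # nature of the report, e.g. "laboratory"
    author: str              # who generated the report
    generated_at: datetime   # when it was generated
    demographics: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class Document:
    header: ReportHeader
    body: Any  # payload kept in whatever form the source system produced


def wrap(patient_id, report_type, author, generated_at, body, **demographics):
    """Attach the common header to a source-specific payload."""
    header = ReportHeader(patient_id, report_type, author, generated_at,
                          dict(demographics))
    return Document(header, body)


# Two documents with different body shapes: a coded result and free text.
lab = wrap("P001", "laboratory", "lab-system", datetime(2014, 1, 15, 9, 30),
           {"loinc": "4548-4", "value": 7.2, "unit": "%"},
           sex="F", birth_year=1960)
note = wrap("P001", "clinic note", "dr-smith", datetime(2014, 1, 16, 14, 0),
            "Patient reports improved glycemic control.",
            sex="F", birth_year=1960)

# Ordering relies on header fields alone, never on the body format.
timeline = sorted([note, lab], key=lambda d: d.header.generated_at)
```

Keeping the body opaque is the design choice being illustrated: new source formats can be accepted without changing the code that indexes or retrieves documents.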

The VHAN is a clinically integrated network, 
chartered by the state of Tennessee and managed 
through a formal board structure and sub- 
committees. VHAN is currently composed of seven 
health systems that include over 40 hospitals and 
400 ambulatory practices, covering an estimated 
3000 clinicians ranging from ambulatory care to 
advanced subspecialties, with an estimated reach of 
over 3 million patients in the Mid-South area. The 
VHAN is a collection of different healthcare sites, 
each with its own installed EHR and health infor- 
mation technology (HIT) tools. Installed EHR 
systems across VHAN will not be standardized to 
one technology, and therefore VHAN represents 
the full spectrum of EHR adoption and utilization 
diversity. Given multiple EHR platforms, VHAN 
will encourage the use of standards and integration 
technologies to allow affiliate healthcare providers 
access to information across the network, with each 
provider still using a locally installed EHR system. 
As the VHAN Affiliate Exchange Infrastructure 
becomes more established and network sites 
become more sophisticated, VHAN will provide 
information exchange across the network as a 
whole. Data exchange will start with administrative 
and structured clinical data and progress to include 
clinical narratives and summaries. 

Figure 1 Mid-South Clinical Data Research Network (CDRN) systems and resources. VU, Vanderbilt University; IMPH, Vanderbilt Institute for 
Medicine and Public Health; CHSSR, Vanderbilt Center for Health Services Research; VICTR, Vanderbilt Institute for Clinical and Translational 
Research; REDCap, Research Electronic Data Capture; IRB, Institutional Review Board; CTSA, Clinical and Translational Science Award Program; 
eMERGE, Electronic Medical Records and Genomics Network; VHAN, Vanderbilt Health Affiliate Network; VA, Veterans Administration; CMS, Centers 
for Medicare & Medicaid Services. 

Greenway Medical Technologies provides integrated EHR and 
practice management software and services to over 2000 
ambulatory care practices across the 
country. The PrimeSuite EHR includes the patient chart, numer- 
ous clinical tools, and content management pages, as well as 
practice management functionality. PrimeSuite's integrated plat- 
form supports a wide variety of primary care and specialty prac- 
tices including Federally Qualified Health Centers. The platform 
also supports a patient-centered medical home for facilities that 
utilize specific modules and reporting tools. Greenway uses 
defined syntactic, semantic, and vocabulary standards for stand- 
ardizing and normalizing data from other systems. Throughout 
the data build and exchange process of translation, protocol 
configuration, formatting, and end-point arrivals, Greenway 
technology supports CCD/CDA, XDS.b, PIX, PDQ, HL7v2, 
Direct XDR, and custom clinical content needs. Greenway is 
fostering Consolidated CDA (clinical document architecture), an 
emerging patient data standard that consolidates and accommo- 
dates existing CCD/CDA clinical data for transferring summary 
of care records from within a single source. A small number of 
VHAN practices use Greenway PrimeSuite as their EHR solution, 
and so may become part of PrimeRESEARCH. Any decision 
about whether to target them first will be made during the 
infrastructure-building and governance-development process of 
rolling out the Mid-South CDRN. 

RESEARCH SYSTEMS OVERVIEW 

The planned technical infrastructure for the Mid-South CDRN 
is designed to maximize usage of agreed-upon standards where 
possible, and apply methods that have been well-established at 
Vanderbilt 3 to exchange data where standards adoption is not 
immediately feasible. Currently accepted standards include both 
document-level standards (ie, 'syntactic standards' such as CDA) 
and data formatting standards (ie, 'semantic standards'). 
Standards in use include CDISC (Clinical Data Interchange 
Standards Consortium), 4 MedDRA (Medical Dictionary for Regulatory 
Activities), HL7 (Health Level 7), LOINC (Logical 
Observation Identifiers Names and Codes), and SNOMED CT 
(Systematized Nomenclature of Medicine Clinical Terms). 
Where structured and standardized content is not available, the 
technical infrastructure will leverage existing tools to extract 
information from unstructured documents. For example, text 
processing services will identify clinical attributes and pheno- 
types from corpora of narrative texts in support of clinical 
research. 5 This infrastructural approach has been used success- 
fully in VU and in a health information exchange program in 
West Tennessee. In addition to this basic infrastructure, VU has 
developed, operationalized, and disseminated a number of tech- 
nologies used around the country to support clinical research. 
These technologies will support data integration, streamline 
research activities, and standardize administrative processes 
across the Mid-South CDRN. 
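
As a rough illustration of extracting clinical attributes from narrative text, the sketch below uses keyword matching with a crude negation check. This is an assumption made for illustration, not the text processing services cited above; the term lists and the function name are hypothetical, and a production phenotype algorithm would be far richer.

```python
import re

# Hypothetical term lists, for illustration only.
TERMS = {
    "diabetes": re.compile(r"\b(type 2 diabetes|diabetes mellitus|t2dm)\b", re.I),
    "tobacco use": re.compile(r"\b(smoker|smokes|tobacco use)\b", re.I),
}
# A negation cue earlier in the same sentence marks the mention as negated.
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b[^.;]*$", re.I)


def extract_attributes(note: str) -> dict:
    """Map each attribute mentioned in the note to True (asserted) or False (negated)."""
    found = {}
    for sentence in re.split(r"(?<=[.;])\s+", note):
        for name, pattern in TERMS.items():
            match = pattern.search(sentence)
            if match is None:
                continue
            negated = NEGATION.search(sentence[:match.start()]) is not None
            # A positive mention anywhere in the note outweighs a negated one.
            found[name] = found.get(name, False) or not negated
    return found


result = extract_attributes(
    "History of type 2 diabetes mellitus. Denies tobacco use."
)
```

Attributes that never appear in the note are simply absent from the result, which keeps "not mentioned" distinct from "explicitly negated".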

The Vanderbilt Research Data Warehouse 5 mirrors all clinical 
information contained in the VU EHR system, administrative 
systems, and local research databases, and applies novel inform- 
atics tools to analyze these data (see figure 3). The Research 
Data Warehouse contains structured data and applies text pro- 
cessing algorithms to support research. 6-14 The Research Data 
Warehouse includes the Synthetic Derivative (SD), a fully 
de-identified database containing longitudinal clinical informa- 
tion derived from 2.2 million patients represented in 
Vanderbilt's EHR. The SD is routinely used as a stand-alone 
resource, and has been employed to capture data on disease 
status, disease onset and progression, drug utilization, drug 
responses, instances of polypharmacy and multimorbidity, 
medical procedures, hospital and health system utilization, 
longitudinal laboratory measures, vital signs, social characteristics, 
and health-related behaviors. The SD can also be used in 
conjunction with Vanderbilt's 175 000-sample DNA biologic 
repository, called the BioVU biobank, to identify patient sets for 
genome-phenome analysis. 15-17 The Research Data Warehouse 
also includes the Research Derivative (RD), an identified 
database used for cohort identification and data extraction for 
research purposes. The infrastructure framework for the SD and 
RD databases, and the other related investigator self-service 
tools, will be expanded or replicated across health systems 
within the Mid-South CDRN. 

Figure 2 Mid-South Clinical Data Research Network (CDRN) reach. The top map shows the location of Greenway PrimeRESEARCH sites 
participating in their research network. The bottom map shows Vanderbilt Health Affiliate Network (VHAN) partner hospital sites. 

Table 1 StarChart data standards 

Component | Technical approach/data standard 
Business transactions | HL7 (Health Level 7), ASC (Accredited Standards Committee) X12 
Diagnostic imaging | DICOM (Digital Imaging and Communications in Medicine) 
Laboratory | LOINC (Logical Observation Identifiers Names and Codes) 
Medications | RxNorm, FDB (First Databank) 
Pharmacy | NCPDP (National Council for Prescription Drug Programs), NDC (National Drug Code) 
Providers | UPIN (Unique Physician Identification Number) 
Concept terminology | SNOMED-CT (Systematized Nomenclature of Medicine-Clinical Terms), UMLS (Unified Medical Language System) 
Procedures | CPT (Current Procedural Terminology), HCPCS (Healthcare Common Procedure Coding System) 
Diagnoses | ICD (International Classification of Diseases, versions 9 and 10), ICD-O (International Classification of Diseases for Oncology), SNOMED-CT 
Billing and claims | UB (Uniform Bill) 92, CMS (Centers for Medicare & Medicaid Services) 1500 

Figure 3 The Mid-South CDRN's research data warehousing model includes integrating data from multiple clinical systems into a common 
architecture supporting both de-identified and identified use cases. Beneficiaries include informatics methods development research teams (eg, 
natural language processing, de-id and re-id) whose work translates into improving both research and clinical practice. Scientific investigators across 
the research enterprise also benefit through tools and services built on data warehousing platforms. 

REDCap (Research Electronic Data Capture) is a 
Vanderbilt-developed, secure, web-based platform for building 
and managing online surveys and research databases. Since its 
original creation in 2004, REDCap has become a de facto stand- 
ard for clinical and translational research around the world, 
with use in over 98 000 studies engaging nearly 1000 academic 
and non-profit partner organizations across 75 countries. 18 19 
REDCap supports standardization and shared data management 
for networked clinical research, including comprehensive data 
management workflow and capacity. It also provides a user- 
friendly interface for data entry and validation, audit trails for 
tracking data manipulation, export procedures for common statistical 
packages, data quality checks, and (if desired) a data query 
and resolution workflow. In addition, REDCap 
contains preliminary procedures for importing data from exter- 
nal sources such as EHRs, allowing clinical data to directly feed 
case report forms. 20 REDCap also offers a robust, flexible plat- 
form for straightforward definition of survey instruments and 
administration to patients or family members for detailed data 
collection. Over the past several years, we have recorded over 
691 000 surveys completed in our Vanderbilt-specific installa- 
tion of REDCap alone. 

Subject Locator is a researcher-facing tool that supports 
patient enrollment into clinical and translational research 
studies. By mapping study inclusion and exclusion criteria to 
computable rules, Subject Locator signals whenever a patient is 
deemed 'close to appropriate' for a study. Once a particular 
patient is flagged, notifications are triggered (based on study 
requirements), followed by participant contact and consent. 
A related tool, Record Counter, uses the Research Data 
Warehouse to support efficient feasibility testing based on 
counts of the number of possible subjects within the VU system. 
Record Counter applies a sophisticated search mechanism that 
allows for complex system queries and returns counts of records 
stratified by race, sex, and age. 
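
The two tools described above can be sketched in a few lines. Everything here is an assumption made for illustration: the criteria, the patient fields, the reading of 'close to appropriate' as "missing at most one inclusion criterion", and the age bands are hypothetical and do not describe the actual Subject Locator or Record Counter implementations.

```python
from collections import Counter

# Hypothetical criteria for an illustrative study, not a real protocol.
INCLUSION = [
    ("adult", lambda p: p["age"] >= 18),
    ("coronary heart disease code",
     lambda p: any(code.startswith("I25") for code in p["icd10"])),
    ("A1C on file", lambda p: p.get("a1c") is not None),
]
EXCLUSION = [
    ("already enrolled", lambda p: p.get("enrolled", False)),
]


def screen(patient, max_missed=1):
    """Return (status, names of unmet inclusion criteria)."""
    if any(test(patient) for _, test in EXCLUSION):
        return "excluded", []
    unmet = [name for name, test in INCLUSION if not test(patient)]
    if not unmet:
        return "eligible", unmet
    if len(unmet) <= max_missed:
        return "close", unmet  # assumed meaning of 'close to appropriate'
    return "not eligible", unmet


def age_band(age):
    return "<18" if age < 18 else "18-64" if age < 65 else "65+"


def stratified_counts(patients, predicate):
    """Count matching records by (race, sex, age band), as in a feasibility query."""
    return Counter((p["race"], p["sex"], age_band(p["age"]))
                   for p in patients if predicate(p))


patients = [
    {"id": "A", "age": 67, "sex": "F", "race": "Black", "icd10": ["I25.10"], "a1c": 6.9},
    {"id": "B", "age": 54, "sex": "M", "race": "White", "icd10": ["I25.10"], "a1c": None},
    {"id": "C", "age": 16, "sex": "F", "race": "White", "icd10": ["E11.9"], "a1c": None},
    {"id": "D", "age": 71, "sex": "M", "race": "White", "icd10": ["I25.2"], "a1c": 7.4,
     "enrolled": True},
]

statuses = {p["id"]: screen(p)[0] for p in patients}
counts = stratified_counts(
    patients, lambda p: any(c.startswith("I25") for c in p["icd10"]))
```

Returning the unmet criteria alongside the status lets a study coordinator see why a patient was flagged as close rather than eligible.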

ResearchMatch 21 is a disease-neutral, geography-neutral 
research registry developed by our team that currently connects 
51 000 patient and family volunteers with 1900 researchers at 
87 participating medical centers across the country. The platform 
has proven effective for connecting volunteers and researchers 
for study recruitment. 22 We have also found ResearchMatch 
volunteers to be receptive and eager to participate in 
research prioritization focus groups and patient/family 
surveys. 23 The ResearchMatch consortium of diverse patients 
and research teams from around the country can play an 
important role in planning and executing operations and 
individual CDRN studies. 

IRBshare is a new shared institutional review board (IRB) 
review model for multi-site studies in which participating 
institutions use shared review documents and a shared 
review process, supported by a centralized, secure web portal 
and the IRBshare Master Agreement. IRBshare is a national 
project, designed and managed at Vanderbilt, and comprises 
35 participating institutions to date. ContractShare is an 
evolving initiative patterned after IRBshare that will streamline 
contracting processes for multi-site studies. The master contract 
has been drafted (collaboratively by ~25 Clinical and 
Translational Science Awards (CTSA) sites) and is under review 
with industry sponsors and other stakeholders. The model can 
be readily expanded to support the CDRNs in future work that 
requires multiple subcontracts across organizations. 
Table 2 Proposed initial cohorts 

Cohort name: Rare disease cohort 
Target population size: Over 400 patients with sickle cell disease (adult and pediatric) 
Proposed membership sources: Identification of patients in the VU and VHAN (systems 1 and 2), and patients cared for at Vanderbilt Meharry Matthew Walker Center of Excellence in Sickle Cell Disease 
Proposed data elements: Patient-reported willingness to participate in future clinical studies, emergency department hospitalizations, and current medications, solicitation of attitudes to decrease hospitalization, ED visits, and readmissions 

Cohort name: Weight status cohort 
Target population size: 10 000 adult patients 
Proposed membership sources: Identification of patients in the VU EHR (system 1), VHAN EHRs (system 2), and Greenway PrimeRESEARCH network (system 3) 
Proposed data elements: Body mass index (BMI), weight, height, blood pressure, presence of comorbidities (ICD codes), current medications, select laboratory measures (A1C, BG, liver tests), patient-reported characteristics, and attitudes related to study participation. EMA of health behaviors in a subset of 1000 patients 

Cohort name: Network cohort of choice (coronary heart disease) 
Target population size: 10 000 adult patients 
Proposed membership sources: Identification of patients in the VU EHR (system 1), and if needed, VHAN EHRs (system 2) 
Proposed data elements: Sociodemographic characteristics, health literacy, medication adherence, diet, exercise, tobacco use status, BMI, weight, height, blood pressure, presence of comorbidities (ICD codes), current medications, select laboratory measures (A1C, creatinine), attitudes related to study participation 

A1C, hemoglobin A1C; BG, blood glucose; ED, emergency department; EHR, electronic health record; EMA, ecological momentary assessment; ICD, International Classification of 
Diseases; VHAN, Vanderbilt Healthcare Affiliated Network; VU, Vanderbilt Health System. 



A clinical decision support (CDS) service will be a central com- 
ponent of our CDRN to embed research activities within the 
healthcare systems without disrupting the business of providing 
healthcare. The CDS service will enable interventional studies, 
including randomization of interventions delivered in 
best-practice formats of standard CDS types (eg, alerts, 
reminders, ordering support, guidelines, forms, templates). 24 
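
One way a CDS service could allocate patients to intervention arms is sketched below. It uses a deterministic, hash-based allocation as a stand-in for a proper randomization service, so that a patient sees the same CDS type at every encounter without any stored state. The arm names, study identifier, and function name are hypothetical, and this is not the method described in the cited work.

```python
import hashlib

# Hypothetical arms built from standard CDS types plus a control arm.
CDS_ARMS = ("alert", "reminder", "usual care")


def assign_arm(study_id: str, patient_id: str, arms=CDS_ARMS) -> str:
    """Map a (study, patient) pair to one arm, identically on every call."""
    key = f"{study_id}:{patient_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an integer; modulo bias is negligible here.
    index = int.from_bytes(digest[:8], "big") % len(arms)
    return arms[index]


# The same patient always lands in the same arm for a given study.
first = assign_arm("study-001", "P001")
second = assign_arm("study-001", "P001")

# Allocation across many patients is roughly balanced.
tally = {arm: 0 for arm in CDS_ARMS}
for n in range(3000):
    tally[assign_arm("study-001", f"P{n:05d}")] += 1
```

Including the study identifier in the hashed key means a patient's arm in one study says nothing about their arm in another. A real trial would typically also need stratification and allocation concealment, which this sketch does not attempt.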

In addition to the tools described above, our CDRN plans to 
incorporate significant resources to support comparative effective- 
ness research and stakeholder engagement. The planned CDRN 
will benefit from engagement with the Vanderbilt Center for 
Health Services Research, which currently has over 120 faculty 
supported by over $50 million in annual research funding for comparative 
effectiveness research, pragmatic clinical trials, health economics 
and decision sciences, health communication research, 
health disparities research, community-based participatory 
research, and implementation sciences. Authentic stakeholder 
involvement is critical to the successful implementation of patient- 
centered outcomes research and our CDRN. Our overarching 
goals for stakeholder engagement are to develop the infrastructure 
to facilitate the meaningful involvement of patients, families, and 
clinicians in all aspects of our CDRN, and cultivate a research 
environment that values input from stakeholders and respects sta- 
keholders' perceptions of the relevance and acceptability of 
research generally, and of specific research studies. 

During the first 18 months, we will demonstrate CDRN cap- 
acity through identification, recruitment, and data collection 
from three cohorts: (1) sickle cell disease, (2) coronary heart 
disease, and (3) a cohort focused on weight status (see table 2). 
Establishing these cohorts will demonstrate our ability to iden- 
tify and extract data from our EHR, reach patients through our 
web portal, collect patient-reported data, and link to genomic 
and other clinical data. 
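
For the weight status cohort, BMI is derived from the weight and height listed in table 2 as weight in kilograms divided by the square of height in meters. The sketch below computes it and applies the standard WHO adult categories; using those cutoffs for cohort definition is an assumption, since the cohort's actual criteria are not specified here.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def weight_status(bmi_value: float) -> str:
    """Standard WHO adult BMI categories (assumed, not from the cohort protocol)."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25:
        return "normal weight"
    if bmi_value < 30:
        return "overweight"
    return "obese"


# Hypothetical records using the data elements named in table 2.
records = [
    {"id": "A", "weight_kg": 95.0, "height_m": 1.75},
    {"id": "B", "weight_kg": 70.0, "height_m": 1.75},
    {"id": "C", "weight_kg": 85.0, "height_m": 1.75},
]
status_by_id = {
    r["id"]: weight_status(bmi(r["weight_kg"], r["height_m"])) for r in records
}
```

Rejecting non-positive inputs up front matters in practice, because EHR height and weight fields often contain zeros or unit-entry errors that would otherwise yield meaningless BMI values.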

SUMMARY 

Developing the Mid-South CDRN and the larger PCORnet 
research network represents an exciting and innovative oppor- 
tunity to collaboratively support comparative effectiveness 
research and pragmatic clinical trials, and to improve health for 
patients with both common and rare conditions. By the end of 
18 months, we anticipate the Mid-South CDRN will be posi- 
tioned to take on projects in comparative effectiveness research, 
pragmatic clinical trials, and other key research areas, and have 
the capacity to share data and HIT nationally. 